我们开发了一种通用机制,用于从概率的驾驶行为基础模型中生成车辆型特定路线序列。许多基础行为模型都经过了不包括车辆信息的数据培训,这些数据限制了其在下游应用程序(例如计划)中的实用性。我们的新方法有条件地将这种行为预测模型专门为媒介物类型,通过利用用于生产特定车辆控制器的增强学习算法的副产品。我们展示了如何使用通用的概率行为模型组成车辆特定的价值函数估计,以生成车辆型特定的路线序列,而这些序列序列更可能在物理上是可行的,而不是其车辆敏捷的序列。
translated by 谷歌翻译
我们考虑在线模仿学习(OIL),其中的任务是找到一项通过与环境的积极互动来模仿专家的行为的政策。我们旨在通过分析最流行的石油算法之一匕首来弥合石油政策优化算法之间的差距。具体而言,如果一类政策足以包含专家政策,我们证明匕首会持续遗憾。与以前需要损失的界限不同,我们的结果只需要较弱的假设,即损失相对于策略的足够统计数据(而不是其参数化)。为了确保对更广泛的政策和损失类别的收敛,我们以额外的正则化项增强了匕首。特别是,我们提出了一个遵循定制领导者(FTRL)的变体及其用于石油的自适应变体,并开发了与FTL的内存需求相匹配的记忆效率实现。假设损失的功能是平稳的,并且相对于政策参数凸出,我们还证明,FTRL对任何足够表达的政策类别都持续遗憾,同时保留了$ O(\ sqrt {t})$,在最坏的情况下遗憾案子。我们通过实验对合成和高维控制任务的实验证明了这些算法的有效性。
translated by 谷歌翻译
我们提出了具有可拖动的对数密度的集合数据值数据的新型,有条件的生成概率模型。该模型是由置换模化动力学控制的连续归一化流。这些动力学是由可学习的每集元素项和成对相互作用的驱动的,均通过深神经网络参数化。我们通过应用程序说明了该模型的实用性,包括(1)以视觉上指定的地图信息为条件的复杂交通场景生成,以及(2)直接在图像上调节的对象边界框生成。我们借助罚款,可确保动力学平稳并因此有效解决,我们通过最大程度地提高标记有条件数据标记的条件数据的预期可能性来训练我们的模型。我们的方法在对数的可能性和特定于域特异性指标(越野,碰撞和违规违规)方面极大地超过了非渗透不变基线,从而产生了很难与真实数据区分的现实样本。
translated by 谷歌翻译
Accurate determination of a small molecule candidate (ligand) binding pose in its target protein pocket is important for computer-aided drug discovery. Typical rigid-body docking methods ignore the pocket flexibility of protein, while the more accurate pose generation using molecular dynamics is hindered by slow protein dynamics. We develop a tiered tensor transform (3T) algorithm to rapidly generate diverse protein-ligand complex conformations for both pose and affinity estimation in drug screening, requiring neither machine learning training nor lengthy dynamics computation, while maintaining both coarse-grain-like coordinated protein dynamics and atomistic-level details of the complex pocket. The 3T conformation structures we generate are closer to experimental co-crystal structures than those generated by docking software, and more importantly achieve significantly higher accuracy in active ligand classification than traditional ensemble docking using hundreds of experimental protein conformations. 3T structure transformation is decoupled from the system physics, making future usage in other computational scientific domains possible.
translated by 谷歌翻译
Adversarial imitation learning (AIL) has become a popular alternative to supervised imitation learning that reduces the distribution shift suffered by the latter. However, AIL requires effective exploration during an online reinforcement learning phase. In this work, we show that the standard, naive approach to exploration can manifest as a suboptimal local maximum if a policy learned with AIL sufficiently matches the expert distribution without fully learning the desired task. This can be particularly catastrophic for manipulation tasks, where the difference between an expert and a non-expert state-action pair is often subtle. We present Learning from Guided Play (LfGP), a framework in which we leverage expert demonstrations of multiple exploratory, auxiliary tasks in addition to a main task. The addition of these auxiliary tasks forces the agent to explore states and actions that standard AIL may learn to ignore. Additionally, this particular formulation allows for the reusability of expert data between main tasks. Our experimental results in a challenging multitask robotic manipulation domain indicate that LfGP significantly outperforms both AIL and behaviour cloning, while also being more expert sample efficient than these baselines. To explain this performance gap, we provide further analysis of a toy problem that highlights the coupling between a local maximum and poor exploration, and also visualize the differences between the learned models from AIL and LfGP.
translated by 谷歌翻译
Many problems in machine learning involve bilevel optimization (BLO), including hyperparameter optimization, meta-learning, and dataset distillation. Bilevel problems consist of two nested sub-problems, called the outer and inner problems, respectively. In practice, often at least one of these sub-problems is overparameterized. In this case, there are many ways to choose among optima that achieve equivalent objective values. Inspired by recent studies of the implicit bias induced by optimization algorithms in single-level optimization, we investigate the implicit bias of gradient-based algorithms for bilevel optimization. We delineate two standard BLO methods -- cold-start and warm-start -- and show that the converged solution or long-run behavior depends to a large degree on these and other algorithmic choices, such as the hypergradient approximation. We also show that the inner solutions obtained by warm-start BLO can encode a surprising amount of information about the outer objective, even when the outer parameters are low-dimensional. We believe that implicit bias deserves as central a role in the study of bilevel optimization as it has attained in the study of single-level neural net optimization.
translated by 谷歌翻译
The Covid-19 pandemic induced a vast increase in adolescents diagnosed with eating disorders and hospitalized due to eating disorders. This immense growth stemmed partially from the stress of the pandemic but also from increased exposure to content that promotes eating disorders via social media, which, within the last decade, has become plagued by pro-eating disorder content. This study aimed to create a deep learning model capable of determining whether a given social media post promotes eating disorders based solely on image data. Tweets from hashtags that have been documented to promote eating disorders along with tweets from unrelated hashtags were collected. After prepossessing, these images were labeled as either pro-eating disorder or not based on which Twitter hashtag they were scraped from. Several deep-learning models were trained on the scraped dataset and were evaluated based on their accuracy, F1 score, precision, and recall. Ultimately, the vision transformer model was determined to be the most accurate, attaining an F1 score of 0.877 and an accuracy of 86.7% on the test set. The model, which was applied to unlabeled Twitter image data scraped from "#selfie", uncovered seasonal fluctuations in the relative abundance of pro-eating disorder content, which reached its peak in the summertime. These fluctuations correspond not only to the seasons, but also to stressors, such as the Covid-19 pandemic. Moreover, the Twitter image data indicated that the relative amount of pro-eating disorder content has been steadily rising over the last five years and is likely to continue increasing in the future.
translated by 谷歌翻译
We introduce a pivot for exact selective inference with randomization. Not only does our pivot lead to exact inference in Gaussian regression models, but it is also available in closed form. We reduce the problem of exact selective inference to a bivariate truncated Gaussian distribution. By doing so, we give up some power that is achieved with approximate inference in Panigrahi and Taylor (2022). Yet we always produce narrower confidence intervals than a closely related data-splitting procedure. For popular instances of Gaussian regression, this price -- in terms of power -- in exchange for exact selective inference is demonstrated in simulated experiments and in an HIV drug resistance analysis.
translated by 谷歌翻译
Using geometric landmarks like lines and planes can increase navigation accuracy and decrease map storage requirements compared to commonly-used LiDAR point cloud maps. However, landmark-based registration for applications like loop closure detection is challenging because a reliable initial guess is not available. Global landmark matching has been investigated in the literature, but these methods typically use ad hoc representations of 3D line and plane landmarks that are not invariant to large viewpoint changes, resulting in incorrect matches and high registration error. To address this issue, we adopt the affine Grassmannian manifold to represent 3D lines and planes and prove that the distance between two landmarks is invariant to rotation and translation if a shift operation is performed before applying the Grassmannian metric. This invariance property enables the use of our graph-based data association framework for identifying landmark matches that can subsequently be used for registration in the least-squares sense. Evaluated on a challenging landmark matching and registration task using publicly-available LiDAR datasets, our approach yields a 1.7x and 3.5x improvement in successful registrations compared to methods that use viewpoint-dependent centroid and "closest point" representations, respectively.
translated by 谷歌翻译
Linear partial differential equations (PDEs) are an important, widely applied class of mechanistic models, describing physical processes such as heat transfer, electromagnetism, and wave propagation. In practice, specialized numerical methods based on discretization are used to solve PDEs. They generally use an estimate of the unknown model parameters and, if available, physical measurements for initialization. Such solvers are often embedded into larger scientific models or analyses with a downstream application such that error quantification plays a key role. However, by entirely ignoring parameter and measurement uncertainty, classical PDE solvers may fail to produce consistent estimates of their inherent approximation error. In this work, we approach this problem in a principled fashion by interpreting solving linear PDEs as physics-informed Gaussian process (GP) regression. Our framework is based on a key generalization of a widely-applied theorem for conditioning GPs on a finite number of direct observations to observations made via an arbitrary bounded linear operator. Crucially, this probabilistic viewpoint allows to (1) quantify the inherent discretization error; (2) propagate uncertainty about the model parameters to the solution; and (3) condition on noisy measurements. Demonstrating the strength of this formulation, we prove that it strictly generalizes methods of weighted residuals, a central class of PDE solvers including collocation, finite volume, pseudospectral, and (generalized) Galerkin methods such as finite element and spectral methods. This class can thus be directly equipped with a structured error estimate and the capability to incorporate uncertain model parameters and observations. In summary, our results enable the seamless integration of mechanistic models as modular building blocks into probabilistic models.
translated by 谷歌翻译